file.tell() ?

Jeff Epler jepler at unpythonic.net
Sun Mar 21 16:25:57 EST 2004


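When you use a file as an iterator, Python reads ahead: it pulls in a
block containing many lines at once and then hands them out one at a time.
For a long file this takes many such blocks, not one big read.  Because
of the read-ahead, tell() reports how far the underlying reads have
gotten, not the end of the line you were just handed.

When I wrote xreadlines (for Python 2.1 or 2.2) it was defined in terms of
readlines(SIZEHINT), but I am no longer familiar with the implementation.
I don't think the exact details are documented anywhere, or guaranteed
not to change between releases.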

I wrote a small program to read all the lines in /usr/share/dict/words
and keep a record of all the positions returned by tell().  Here are the
results:
    [jepler@parrot jepler]$ cat /tmp/mcavoy.py
    d = {}
    f = file("/usr/share/dict/words")
    for l in f:
        d[f.tell()] = None
    dk = d.keys()
    dk.sort()
    print dk
    print dk[-1] * 1.0 / len(dk) # Average block size

    [jepler@parrot jepler]$ python /tmp/mcavoy.py
    [8196L, 16393L, 24596L, 32793L, 40994L, 49186L, 57379L, 65576L,
    73776L, 81972L, 90167L, 98362L, 106557L, 114750L, 122945L, 131137L,
    139332L, 147532L, 155729L, 163922L, 172119L, 180316L, 188515L,
    196710L, 204910L, 213105L, 221306L, 229505L, 237697L, 245896L,
    254088L, 262291L, 270486L, 278688L, 286893L, 295092L, 303288L,
    311488L, 319687L, 327884L, 336082L, 344277L, 352474L, 360675L,
    368869L, 377061L, 385261L, 393459L, 401656L, 409305L]
    8186.1

As you can see, my Python reads about 8K at a time, which is a perfectly
reasonable amount on any machine I still use.
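
If what you actually want is the offset of each line, one way around the
read-ahead is to not iterate over the file at all and call readline()
yourself, asking tell() for the position before each call.  The following
is only a sketch under that assumption: the helper name line_offsets and
the small temporary demo file are made up for illustration, and it uses
current Python syntax and opens the file in binary mode so that the
offsets are plain byte counts.

```python
import os
import tempfile


def line_offsets(path):
    """Return the byte offset at which each line of `path` starts.

    Hypothetical helper: it avoids `for line in f` and uses readline(),
    so tell() is asked for the position before every single line.
    """
    offsets = []
    with open(path, "rb") as f:
        while True:
            pos = f.tell()        # position before reading the next line
            line = f.readline()
            if not line:          # empty bytes object means end of file
                break
            offsets.append(pos)
    return offsets


# Small demo on a throwaway file (the contents are made up).
with tempfile.NamedTemporaryFile("wb", suffix=".txt", delete=False) as tmp:
    tmp.write(b"alpha\nbeta\ngamma\n")
    demo_path = tmp.name

demo_offsets = line_offsets(demo_path)
os.remove(demo_path)
print(demo_offsets)   # [0, 6, 11]
```

This is likely slower than iterating over the file, since it gives up the
block-at-a-time reads described above, but each recorded position
corresponds to exactly one line.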

Jeff



