[Tutor] File Access

Danny Yoo dyoo at hkn.eecs.berkeley.edu
Mon Apr 5 04:33:46 EDT 2004



On Sun, 4 Apr 2004, Nick Lunt wrote:

> I could probably do a readlines() until EOF overwriting a variable with
> the contents of each line so that at the end the variable will hold the
> contents of just the last line, or by putting each line of the file into
> a list and pulling the last element of the list out, but eventually the
> file.txt will get pretty big, so I thought of using seek().

Hi Nick,


Yes, that's one approach.
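
For the record, here is a rough sketch of what the seek() idea might look
like.  The function name last_line_by_seek and the block_size parameter are
made up for illustration, and it assumes a reasonably modern Python and a
file whose lines end in newlines.  It opens the file in binary mode, so it
hands back bytes rather than a string:

```python
import os

def last_line_by_seek(path, block_size=4096):
    """Return the last line of a file (as bytes), reading backwards in blocks."""
    with open(path, "rb") as f:        # binary mode: offsets are plain byte counts
        f.seek(0, os.SEEK_END)         # jump to the end of the file
        pos = f.tell()
        data = b""
        while pos > 0:
            step = min(block_size, pos)
            pos -= step
            f.seek(pos)
            data = f.read(step) + data
            # Once a newline shows up before the trailing one, the
            # whole last line is inside 'data' and we can stop.
            if b"\n" in data.rstrip(b"\n"):
                break
        lines = data.splitlines()
        return lines[-1] if lines else b""
```

Call .decode() on the result if you want text.  The point of the sketch is
only that we touch the tail of the file and never the beginning.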

If you are going to read a large file, definitely try to avoid using
readlines(), because it loads the whole file into memory at once.  Usually
we don't need that: things often work if we read and process one line at a
time, so that only the current line is held in memory.

So instead of:

###
for line in somefile.readlines():     ## First approach
    ...
###

it's more efficient to say:

###
while 1:                              ## Second approach
    line = somefile.readline()
    if not line: break
    ...
###


This second approach is a little ugly, even though it scales better than
the first.  Thankfully, there's a way out: recent versions of Python (2.2
and later) let us write the line-by-line approach by taking advantage of
new features of the for loop:

###
for line in somefile:                 ## Third approach
    ...
###

Here, we "iterate" across a file, just as if we were iterating across a
list.  This third approach looks just as nice as the first, memory-hungry
approach, but with the low memory requirements of the second.  So it's the
best of both worlds.  *grin*


> However, I could not find on python.org or by googling any way to do
> this.

Try iterating across the whole file, and keep track of the very last line
that's read.  That should be the line you're looking for.
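
As a minimal sketch of that idea (the function name last_line is just
something I made up, and the 'with' statement assumes a newer Python than
the one current as I write this), it might look like:

```python
def last_line(path):
    """Return the last line of the file at 'path', or None if it is empty."""
    last = None
    with open(path) as f:
        for line in f:       # one line at a time; the file is never fully loaded
            last = line      # overwrite, so only the newest line is kept
    return last
```

The line comes back with its trailing newline still attached, so you may
want to call rstrip() on it.  This still reads every byte of the file, but
it only ever holds one line in memory.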


If you plan to do this sort of thing a lot, it might be worth it to build
an "index" that records the byte offsets of each line.  We can talk about
that if you'd like.
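
To give a flavor of it, here is one way such an index could be sketched.
The names build_line_index and read_line_at are hypothetical, and I'm
assuming the file is opened in binary mode so that the offsets are exact
byte positions:

```python
def build_line_index(path):
    """Return a list holding the byte offset at which each line starts."""
    offsets = []
    pos = 0
    with open(path, "rb") as f:
        for line in f:
            offsets.append(pos)   # this line starts at byte 'pos'
            pos += len(line)      # the next line starts right after it
    return offsets

def read_line_at(path, offsets, n):
    """Jump straight to line n (0-based; negative counts from the end)."""
    with open(path, "rb") as f:
        f.seek(offsets[n])
        return f.readline()
```

Building the index still costs one full pass, but after that
read_line_at(path, offsets, -1) fetches the last line with a single seek.
The index goes stale as soon as the file changes, so it would need to be
rebuilt or extended after each append.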


If you have any questions on this, please feel free to ask.  Good luck to
you!




