Beginner question: skips every second line in file when using readline()

Paul Watson pwatson at redlinec.com
Mon Oct 20 12:20:22 EDT 2003


"Pettersen, Bjorn S" <BjornPettersen at fairisaac.com> wrote in message
news:mailman.233.1066625775.2192.python-list at python.org...
> From: Hans Nowak [mailto:hans at zephyrfalcon.org]
>
> peter leonard wrote:
>
> > Hi,
> > I'm having a problem with reading each line from a text file.
[...]
> >
> > The following script attempts to print out each line :
> >
> > datafile = "C:\\Classifier\\Data\\test.txt"
> > dataobject = open(datafile,"r")
> >
> > while dataobject.readline() !="":
> >
> >        line = dataobject.readline()
> >        print line
>
> You're calling readline() twice.  Use something like:
>
> line = dataobject.readline()
> while <line is not empty>:
>      ...do stuff...
>      line = dataobject.readline()
>
> or:
>
> for line in dataobject:
>      if <line is empty>:
>          break
>      ...do stuff...
>
> I'm writing <line is empty>, because you might want to revise
> dataobject.readline() != "" as well.  If you read from a text
> file, lines will end in \n, so comparing them to "" will always
> return false.
[..]

To back up a couple of steps... (it looks like you're coming from a
C/C++/Java background <wink>).

In Python, reading from a file (with either read or readline) always
returns a non-empty string unless the end of the file has been reached.
This works because readline doesn't throw away the newline at the end of
a line. I.e. if you're reading an empty line in a text file, readline()
returns the string '\n' (a one-character string). It also gives a
convenient way of testing for the end of the file, since only then is the
empty string '' returned. If you look at Python code that's a little
older, you'll find this idiom:

  while 1:
      line = fp.readline()
      if not line:
          break
      ..do stuff..

note that the empty string tests false (as do all other empty objects
in Python), and the above is considered much better (re. style as well
as flexibility) than:

  if line <> '':

or the harder to type:

  if line != "":
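
To make the difference concrete, here is a minimal sketch comparing the
"while 1" idiom with the loop from the original question. It is written
for current Python 3 (the thread itself uses Python 2 syntax), and it
assumes an in-memory io.StringIO object as a stand-in for the real file;
SAMPLE, read_all and read_skipping are made-up names for illustration.

```python
import io

# Hypothetical sample data standing in for test.txt:
# three lines, the middle one empty.
SAMPLE = "first\n\nthird\n"

def read_all(fp):
    """Collect lines with the 'while 1' idiom: one readline() call per pass."""
    lines = []
    while 1:
        line = fp.readline()
        if not line:          # '' comes back only at end of file
            break
        lines.append(line)
    return lines

def read_skipping(fp):
    """The loop from the original question: readline() is called twice per pass."""
    lines = []
    while fp.readline() != "":       # this line is read and thrown away
        lines.append(fp.readline())  # only every second line is kept
    return lines

print(read_all(io.StringIO(SAMPLE)))       # ['first\n', '\n', 'third\n']
print(read_skipping(io.StringIO(SAMPLE)))  # ['\n', '']
```

The empty line shows up as '\n' and does not end the first loop, while
the second loop keeps only every other line and can even pick up the ''
that marks end of file.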

Assuming that your version of Python is more recent, you can now iterate
over file objects using a for loop without having to deal with end of
file conditions. A common idiom is:

  for line in file(datafile):
      ..do stuff..

(file being the preferred way of spelling open... at least officially ;-)
This takes advantage of two things:

 - the default mode for opening files is for reading ('r'),
   so it doesn't need to be specified.
 - the file is automatically closed at the end of the for loop.

Automatic closing is "implementation defined" behavior -- i.e. it won't
ever change in CPython (the C implementation), where reference counting
closes the file as soon as the last reference to it goes away, but it
doesn't work this way in Jython (the Java implementation). Some people
argue that you should always close files explicitly, like you would in
Jython (and most other programming languages):

  df = file(datafile)
  for line in df:
      ..do something..
  df.close()

personally I just find that obfuscated <grin>.
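
For readers who do want the explicit close, here is a minimal sketch of
two ways to guarantee it even if the loop body raises. It assumes current
Python 3, where file() no longer exists and open() is used instead, and
where the "with" statement (added to the language after this thread) is
available. The function names and the temporary three-line file are
hypothetical.

```python
import os
import tempfile

def count_lines_explicit(path):
    """Count lines, closing the file explicitly even if the loop raises."""
    df = open(path)           # default mode is 'r', so it is not spelled out
    try:
        n = 0
        for line in df:
            n += 1
    finally:
        df.close()            # runs on normal exit and on exceptions
    return n

def count_lines_with(path):
    """Same result using a 'with' block, which closes the file on exit."""
    with open(path) as df:
        return sum(1 for line in df)

# Hypothetical three-line file created just for this demonstration.
fd, path = tempfile.mkstemp(suffix=".txt")
with os.fdopen(fd, "w") as out:
    out.write("a\nb\nc\n")

print(count_lines_explicit(path), count_lines_with(path))  # 3 3
os.remove(path)
```

Neither version relies on reference counting, so both should behave the
same on implementations that do not close files automatically.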

hth,
-- bjorn


Does this cause the entire input file to be read into memory before the for
loop begins execution?

This is great for reading 5 lines, but I might need to read 30 million lines
from a mortgage company file.  I cannot read the entire file into memory.
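
One way to check this is to count how many bytes are actually pulled from
the underlying source before the first line is handed to the loop. The
sketch below assumes current Python 3, where a text file object is built
from the same layers used here (a raw stream, io.BufferedReader, and
io.TextIOWrapper). CountingRaw and first_line_cost are hypothetical
helpers, and the in-memory data stands in for a large file.

```python
import io

class CountingRaw(io.RawIOBase):
    """Hypothetical raw stream that records how many bytes were requested from it."""

    def __init__(self, data):
        super().__init__()
        self._src = io.BytesIO(data)
        self.bytes_read = 0

    def readable(self):
        return True

    def readinto(self, buf):
        n = self._src.readinto(buf)   # copy at most len(buf) bytes into buf
        self.bytes_read += n
        return n

def first_line_cost(data):
    """Return the first line and the number of bytes read to produce it."""
    raw = CountingRaw(data)
    text = io.TextIOWrapper(io.BufferedReader(raw), encoding="ascii")
    for line in text:
        return line, raw.bytes_read   # stop after the first iteration
    return "", raw.bytes_read

data = b"x" * 79 + b"\n"              # one 80-byte line ...
data = data * 100000                  # ... repeated: about 8 MB in total
line, cost = first_line_cost(data)
print(len(data), cost)                # cost is a small buffer's worth, not the total
```

Under these assumptions the loop starts after only a buffer-sized chunk
has been read, so iterating over a file object keeps memory use roughly
constant regardless of how many lines the file has. Calling readlines()
or read() with no argument, by contrast, does load everything at once.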





