What is the best way to handle a missing newline in the following case

Peter Otten __peter__ at web.de
Fri Nov 5 11:32:57 EDT 2010


chad wrote:

> I have a text file with the following numbers
> 
> 1
> 3
> 5
> 7
> 3
> 9
> 
> 
> 
> Now the program reads in this file. When it encounters a '\n', it will
> do some stuff and then move to the next line. For example, it will
> read 1 and then '\n'. When it sees '\n', it will do some stuff and go
> on to read 3.
> 
> The problem is when I get to the last line. When the program sees '\n'
> after the 9, everything works fine. However, when there isn't a '\n',
> the program doesn't process the last line.
> 
> What would be the best approach to handle the case of the possible
> missing '\n' at the end of the file?

Don't split the data into lines yourself; delegate that to Python:

>>> with open("tmp.txt", "wb") as f:
...     f.write("1\n2\n3\r\n4\r5")
...
>>> for line in open("tmp.txt", "U"):
...     print repr(line)
...
'1\n'
'2\n'
'3\n'
'4\n'
'5'

As you can see, "\n", "\r\n" and "\r" are all converted to "\n". This is
called universal newline mode and is enabled by open(..., "U"). If your client
code insists that a line has to end with "\n" and there's no way to change
that, you can wrap the file:

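>>> def ensure_newline(instream):
...     instream = iter(instream)
...     prev = next(instream, None)
...     if prev is None:  # empty input: nothing to yield
...         return
...     for cur in instream:
...         yield prev
...         prev = cur
...     if not prev.endswith("\n"):
...         prev += "\n"
...     yield prev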
...
>>> for line in ensure_newline(open("tmp.txt", "U")):
...     print repr(line)
...
'1\n'
'2\n'
'3\n'
'4\n'
'5\n'
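
For readers on Python 3, here is a rough sketch of the same session. It
assumes that text mode with the default newline=None does the newline
translation (so no "U" flag is needed; that flag was removed in Python 3.11),
and the temporary file location is only an illustration:

```python
import os
import tempfile


def ensure_newline(lines):
    # Yield every line unchanged, except that the last one is
    # guaranteed to end with a newline character.
    lines = iter(lines)
    prev = next(lines, None)
    if prev is None:  # empty input: nothing to yield
        return
    for cur in lines:
        yield prev
        prev = cur
    if not prev.endswith("\n"):
        prev += "\n"
    yield prev


# Write mixed line endings as bytes so nothing is translated on output.
path = os.path.join(tempfile.mkdtemp(), "tmp.txt")  # illustrative location
with open(path, "wb") as f:
    f.write(b"1\n2\n3\r\n4\r5")

# Default text mode translates "\r\n" and "\r" to "\n" while reading.
with open(path) as f:
    raw_lines = list(f)
with open(path) as f:
    fixed_lines = list(ensure_newline(f))

print(raw_lines)    # last item has no trailing newline
print(fixed_lines)  # last item ends with "\n"
```

The wrapper holds back one line at a time, so it never needs to load the
whole file just to find out which line is the last one.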

But often the opposite direction, removing any newlines, works just as well 
and is easier to achieve:

>>> for line in open("tmp.txt", "U"):
...     print repr(line.strip("\n"))
...
'1'
'2'
'3'
'4'
'5'
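
Applied to the numbers from the original question, the stripping approach
might look like the sketch below. read_numbers is a hypothetical helper, not
something from the original program, and it assumes one integer per line with
blank lines skipped; io.StringIO stands in for the real file:

```python
import io


def read_numbers(stream):
    # Hypothetical helper: yield one int per non-blank line.
    for line in stream:
        text = line.strip()  # drops the newline, if there is one
        if text:
            yield int(text)


# The same data, with and without a newline after the final 9.
with_newline = io.StringIO("1\n3\n5\n7\n3\n9\n")
without_newline = io.StringIO("1\n3\n5\n7\n3\n9")

a = list(read_numbers(with_newline))
b = list(read_numbers(without_newline))
print(a == b)
```

Because each line is handled only after its newline has been removed, the
last line takes exactly the same path whether or not the file ends with "\n".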

Peter


