strange python scripting error

Peter Otten __peter__ at web.de
Thu Jul 23 18:49:47 EDT 2009


Mark Tarver wrote:

> On 23 July, 18:01, Dennis Lee Bieber <wlfr... at ix.netcom.com> wrote:
>> On Thu, 23 Jul 2009 08:48:46 -0700 (PDT), Mark Tarver
>> <dr.mtar... at ukonline.co.uk> declaimed the following in
>> gmane.comp.python.general:
>>
>> > The only hint at a difference I can see is that my ftp program says
>> > the files are of unequal lengths.  test.py is 129 bytes long.
>> > python.py 134 bytes long.
>>
>> Just a guess...
>>
>> Line endings... <lf> vs <cr><lf>
>>
>> --
>> Wulfraed        Dennis Lee Bieber               KD6MOG
>> wlfr... at ix.netcom.com             wulfr... at bestiaria.com
>> HTTP://wlfraed.home.netcom.com/
>> (Bestiaria Support Staff:               web-a... at bestiaria.com)
>> HTTP://www.bestiaria.com/
> 
> Is that linefeed + ctrl or what?  I can't pick up any difference
> reading the files char by char in Lisp.  How do you find the
> difference?

carriage-return + linefeed

That's the Windows convention for end-of-line markers. Unix uses linefeed 
only.

If you are on Windows you have to open the file in binary mode to see the 
difference, because text mode silently translates "\r\n" to "\n" when 
reading:

>>> open("python.py", "rb").read()
'#!/usr/bin/python\r\nprint "Content-type: text/html"\r\nprint\r\nprint "<html>"\r\nprint "<center>Hello, Linux.com!</center>"\r\nprint "</html>"'
>>> open("test.py", "rb").read()
'#!/usr/bin/python\nprint "Content-type: text/html"\nprint\nprint "<html>"\nprint "<center>Hello, Linux.com!</center>"\nprint "</html>"'
>>>

\n denotes a linefeed, also called newline (chr(10))
\r denotes a carriage return (chr(13))
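
If you would rather count the line endings than read them off the repr, 
something like the following should do. It is only a sketch, written for 
Python 3; the helper name count_line_endings and the sample data are made 
up for illustration.

```python
def count_line_endings(data):
    """Count CRLF, bare LF, and bare CR line endings in a bytes object."""
    crlf = data.count(b"\r\n")
    lf = data.count(b"\n") - crlf   # linefeeds not preceded by a carriage return
    cr = data.count(b"\r") - crlf   # carriage returns not followed by a linefeed
    return {"crlf": crlf, "lf": lf, "cr": cr}


# Hypothetical sample data standing in for the two files.
unix_text = b'#!/usr/bin/python\nprint "Content-type: text/html"\nprint\n'
windows_text = unix_text.replace(b"\n", b"\r\n")

print(count_line_endings(unix_text))       # {'crlf': 0, 'lf': 3, 'cr': 0}
print(count_line_endings(windows_text))    # {'crlf': 3, 'lf': 0, 'cr': 0}
print(len(windows_text) - len(unix_text))  # 3, one extra byte per line ending
```

For a real file, pass in open("python.py", "rb").read(). The five "\r\n" 
sequences in python.py account for the 134 versus 129 bytes your ftp 
program reports.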

You can fix the line endings with

open(outfile, "wb").writelines(open(infile, "rU"))

Here "U" denotes "universal newline" mode, which recognizes "\r", "\n", and 
"\r\n" as line endings and translates all of them into "\n". "wb" means 
"write binary", which does no conversion. Therefore you'll end up with 
outfile following the Unix convention.
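
The one-liner above is Python 2. Recent Python 3 releases no longer accept 
the "U" flag in open(), so there the same conversion can be done on the 
raw bytes instead. A minimal sketch, assuming Python 3 and files small 
enough to read into memory; to_unix_newlines and convert_file are names 
invented for this example.

```python
def to_unix_newlines(data):
    """Convert CRLF and bare CR line endings in a bytes object to LF."""
    # Replace CRLF first so that its carriage return is not converted twice.
    return data.replace(b"\r\n", b"\n").replace(b"\r", b"\n")


def convert_file(infile, outfile):
    """Write a copy of infile to outfile using Unix line endings."""
    # Binary mode on both sides, so Python itself translates nothing.
    with open(infile, "rb") as src:
        data = src.read()
    with open(outfile, "wb") as dst:
        dst.write(to_unix_newlines(data))


print(to_unix_newlines(b"print\r\nprint\rprint\n"))  # b'print\nprint\nprint\n'
```

Calling convert_file("python.py", "fixed.py") would then write a copy with 
Unix line endings; "fixed.py" is just a placeholder name.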

Peter



