line break confusion

Carl Banks idot at vt.edu
Sat Feb 16 00:26:33 EST 2002


Michael Mell wrote:
> Sean 'Shaleh' Perry wrote:
> 
>> Windows: \r\n
>> Linux (UNIX): \n
>> Mac: \r
>>
>> Each OS's libs transform the sequence '\n' to whatever the right
>> thing is for them.
>
> Thanks for answering such an apparently ignorant question, but the
> answer is not so simple.
> 
> I want to read and write files with line breaks which are not of the
> native platform type.
> 

> Problem #1 is that on a Mac, the values of Mac and Unix are
> /swapped/ and therefore wrong. Writing a file using '\n' results in
> a /Mac/ line break (as reported by BBEdit, the venerable Mac
> editor). Writing a file using '\r' results in a /Unix/ file. The Mac
> seems unable to write Windows \r\n no matter what order is used.


You need to open the file in binary mode:

open(file,"rb")
open(file,"wb")

By default, Python opens the file in text mode, which helpfully
converts each '\n' to and from whatever the line-ending convention
of your platform is.  Helpfully, that is, unless you want to ignore
your platform's conventions and choose your own line delimiters.
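
As a rough sketch of the writing side (written for a current Python 3,
not the Python of this thread; the helper names write_lines and
read_raw are made up for illustration), binary mode lets you pick the
terminator yourself:

```python
def write_lines(path, lines, newline=b"\r\n"):
    """Write each line followed by the chosen terminator.

    Hypothetical helper: 'newline' is whatever byte sequence you want,
    e.g. b"\r\n" (Windows), b"\n" (Unix) or b"\r" (classic Mac).
    """
    # "wb" = binary mode, so nothing is translated on the way out.
    with open(path, "wb") as f:
        for line in lines:
            f.write(line.encode("ascii") + newline)


def read_raw(path):
    """Return the file's bytes exactly as stored on disk."""
    with open(path, "rb") as f:
        return f.read()
```

Calling write_lines("out.txt", ["one", "two"], b"\r\n") should leave
the \r\n pairs in the file on any platform, since nothing is
translated in binary mode.  The ASCII encoding is only an assumption
to keep the sketch short.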


> Problem #2 is that when reading a Windows file on Linux or Mac, a
> /single/ line ('\r\n') is interpreted as /two/ lines ('\n\n'). Files
> written out on Linux then have double the correct number of lines.

It can be worked around.  The characters won't be converted in binary
mode, so you can check whether each line ends in '\r\n', '\n', or
'\r', and throw out the superfluous empty line that the extra '\r'
would otherwise produce.
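
One possible way to do that, again as a sketch in current Python 3
(split_any_newline is a hypothetical name, not a library function):
read the bytes in binary mode, fold '\r\n' into a single break first,
then treat any remaining lone '\r' as a break too.

```python
def split_any_newline(data):
    """Split bytes into lines, accepting \\r\\n, \\n or \\r as a break.

    Hypothetical helper.  \\r\\n is replaced first so that a Windows
    line break counts as one break rather than two.
    """
    normalized = data.replace(b"\r\n", b"\n").replace(b"\r", b"\n")
    lines = normalized.split(b"\n")
    # A trailing terminator leaves one empty piece at the end; drop it.
    if lines and lines[-1] == b"":
        lines.pop()
    return lines
```

Genuinely empty lines in the middle of the file are kept; only the
artifact after the final terminator is dropped.  The order of the two
replace calls matters: replacing '\r' first would turn every '\r\n'
into two breaks, which is the doubling described above.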


-- 
CARL BANKS                                http://www.aerojockey.com
"Nullum mihi placet tamquam provocatio magna.  Hoc ex eis non est."


