[Python-Dev] New lines, carriage returns, and Windows

Dino Viehland dinov at exchange.microsoft.com
Wed Sep 26 22:23:58 CEST 2007


We ran into an interesting user-reported issue with IronPython and the way Python writes to files, and I thought I'd get python-dev's opinion.

When writing a string in text mode that contains \r\n, both CPython and IronPython write \r\r\n, because the default text-mode behavior on Windows is to replace \n with \r\n.  This works great as long as you stay within an entirely Python world.  Because Python uses \n for everything internally, you'll never end up writing out a \r\n that gets transformed into a \r\r\n.  But when interoperating with other native code (or .NET code in our case) it's fairly easy to be exposed to a string which contains \r\n.  Ultimately we see odd behavior when round-tripping the contents of a multi-line text box through a file.
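To make the doubling concrete, here is a minimal sketch. It assumes a current Python 3 interpreter, where io.TextIOWrapper accepts newline="\r\n" to force the Windows-style translation on any platform; the helper name write_text_windows_style is made up for illustration.

```python
import io


def write_text_windows_style(text):
    # Hypothetical helper: return the bytes a text-mode file would hold
    # when every "\n" written is translated to "\r\n".
    raw = io.BytesIO()
    # newline="\r\n" forces the translation regardless of the host platform.
    wrapper = io.TextIOWrapper(raw, encoding="ascii", newline="\r\n")
    wrapper.write(text)
    wrapper.flush()
    data = raw.getvalue()
    wrapper.detach()  # leave the underlying buffer alone
    return data


# A string built inside Python uses "\n" only and comes out clean.
from_python = write_text_windows_style("one\ntwo\n")
# A string from a text box already contains "\r\n"; only the "\n" half
# is translated, so each line ending gains an extra "\r".
from_textbox = write_text_windows_style("one\r\ntwo\r\n")

print(from_python)
print(from_textbox)
```

The second result is the file content that no platform treats as a normal line ending.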

So today users have to be aware of the fact that Python internally always uses \n.  They also need to be aware of any APIs they call that might return a string containing \r\n, and transform that string back into the Python form before writing it.
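The manual transformation users have to apply today could look roughly like this. It is only a sketch; the function name to_python_newlines is an assumption, not an existing API.

```python
def to_python_newlines(text):
    # Hypothetical helper: collapse "\r\n" pairs into Python's internal "\n"
    # so that a later text-mode write cannot produce "\r\r\n".
    return text.replace("\r\n", "\n")


# Text as it might come back from a native or .NET API.
textbox_contents = "first line\r\nsecond line\r\n"
normalized = to_python_newlines(textbox_contents)
print(repr(normalized))
```

Strings that already use \n only pass through unchanged, so the helper is safe to call on any text before writing it.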

It could be argued that there's little value in doing the simple transformation from \r\n -> \r\r\n.  Ultimately that creates a file whose line endings aren't good on any platform.  On the other hand, it could also be argued that Python defines new-lines as \n and there should be no deviation from that.  Deviating could also be considered a slippery slope: first the file object deals with it, next the standard libraries, etc...  Finally, this might break some apps, and if we changed IronPython to behave differently we could introduce incompatibilities which we don't want.

So I'm curious: Is there a reason this behavior is useful that I'm missing?  Would there be a possibility (or objections to) making \r\n be untransformed in the Py3k timeframe?  Or should we just tell our users to open files in binary mode? :)
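For reference, the binary-mode workaround might be sketched as follows. The helper write_untranslated and the UTF-8 default are assumptions for illustration, and the example uses a temporary file so it can run anywhere.

```python
import os
import tempfile


def write_untranslated(path, text, encoding="utf-8"):
    # Hypothetical helper: binary mode performs no newline translation,
    # so "\r\n" in the input is written exactly as "\r\n".
    with open(path, "wb") as f:
        f.write(text.encode(encoding))


fd, path = tempfile.mkstemp()
os.close(fd)
try:
    write_untranslated(path, "one\r\ntwo\r\n")
    with open(path, "rb") as f:
        round_tripped = f.read()
finally:
    os.remove(path)

print(round_tripped)
```

The cost is that the caller now has to pick an encoding and supply platform line endings itself, which is the convenience text mode normally provides.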

