[Python-Dev] [Python-3000] Universal newlines support in Python 3.0
Guido van Rossum
guido at python.org
Sat Aug 11 19:29:38 CEST 2007
On 8/11/07, Tony Lownds <tony at pagedna.com> wrote:
> On Aug 10, 2007, at 11:23 AM, Guido van Rossum wrote:
> > Python 3.0 currently has limited universal newlines support: by
> > default, \r\n is translated into \n for text files, but this can be
> > controlled by the newline= keyword parameter. For details on how, see
> > PEP 3116. The PEP prescribes that a lone \r must also be translated,
> > though this hasn't been implemented yet (any volunteers?).
> I'm working on this, but now I'm not sure how the file is supposed to
> be read when the newline parameter is \r or \r\n. Here's the PEP
> language:
> buffer is a reference to the BufferedIOBase object to be wrapped with
> the TextIOWrapper.
> encoding refers to an encoding to be used for translating between the
> byte-representation and character-representation. If it is None, then
> the system's locale setting will be used as the default.
> newline can be None, '\n', '\r', or '\r\n' (all other values are
> illegal); it indicates the translation for '\n' characters written. If
> None, a system-specific default is chosen, i.e., '\r\n' on Windows and
> '\n' on Unix/Linux. Setting newline='\n' on input means that no CRLF
> translation is done; lines ending in '\r\n' will be returned as '\r\n'.
> ('\r' support is still needed for some OSX applications that produce
> files using '\r' line endings; Excel (when exporting to text) and Adobe
> Illustrator EPS files are the most common examples.)
> Is this OK: when newline='\r\n' or newline='\r' is passed, only that
> string is used to determine the end of lines. No translation to '\n'
> is done.
I *think* it would be more useful if it always returned lines ending
in \n (not \r\n or \r). Wouldn't it? Although this is not how it
currently behaves; when you set newline='\r\n', it returns the \r\n
unchanged, so it would make sense to do this too when newline='\r'.
Caveat user I guess.
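To make the options concrete, here is a minimal sketch of what each
newline= value does on input. It assumes the semantics that later
Python 3 releases settled on in io.TextIOWrapper (not necessarily the
state of the code as of this message); the sample bytes and the
read_lines helper are made up for illustration:

```python
import io

# Made-up sample data that mixes all three line endings.
SAMPLE = b"one\r\ntwo\rthree\n"


def read_lines(newline):
    """Return the lines of SAMPLE as read with the given newline= value.

    Hypothetical helper: wraps an in-memory buffer instead of a real file.
    """
    wrapper = io.TextIOWrapper(
        io.BytesIO(SAMPLE), encoding="ascii", newline=newline
    )
    return wrapper.readlines()


# Show how the same bytes are split for each legal newline= value.
for value in (None, "", "\n", "\r", "\r\n"):
    print(repr(value), read_lines(value))
```

Under those assumed semantics, newline='\r' and newline='\r\n' end
lines only at that string and hand it back untranslated, which matches
Tony's proposal; only newline=None translates every ending to \n.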
> > However, the old universal newlines feature also set an attribute named
> > 'newlines' on the file object to a tuple of up to three elements
> > giving the actual line endings that were observed on the file so far
> > (\r, \n, or \r\n). This feature is not in PEP 3116, and it is not
> > implemented. I'm tempted to kill it. Does anyone have a use case for
> > this? Has anyone even ever used this?
> This strikes me as a pragmatic feature, making it easy to read a file
> and write back the same line ending. I can include it in the patch.
OK, if you think you can, that's good. It's not always sufficient (not
if there was a mix of line endings) but it's a start.
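As a rough sketch of that use case: read with universal newlines, look
at the newlines attribute, and write back with the same ending. This
assumes the attribute behaves as in later Python 3 releases (a single
string when one ending was seen, a tuple when several were, None when
none were); the roundtrip_upper helper and the fallback to '\n' for
mixed files are illustrative choices, not part of any patch:

```python
import io


def roundtrip_upper(data: bytes, encoding: str = "ascii") -> bytes:
    """Upper-case the text in data, preserving its line ending if unique.

    Hypothetical helper working on in-memory bytes rather than real files.
    """
    # newline=None enables universal newlines: every ending becomes '\n'
    # and the endings actually seen are recorded on reader.newlines.
    reader = io.TextIOWrapper(io.BytesIO(data), encoding=encoding, newline=None)
    text = reader.read()
    seen = reader.newlines

    # A plain string means exactly one kind of ending was observed.
    # None (no endings) or a tuple (mixed endings) falls back to '\n',
    # which is the "not always sufficient" case.
    ending = seen if isinstance(seen, str) else "\n"

    out = io.BytesIO()
    # On output, newline=ending translates each '\n' written to that ending.
    writer = io.TextIOWrapper(out, encoding=encoding, newline=ending)
    writer.write(text.upper())
    writer.flush()
    writer.detach()  # keep the underlying buffer open and readable
    return out.getvalue()


print(roundtrip_upper(b"a\r\nb\r\n"))
```

A file with mixed endings comes back normalized to '\n' here, so the
original bytes cannot be reproduced exactly from the attribute alone.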
--Guido van Rossum (home page: http://www.python.org/~guido/)