[Python-3000] [Python-Dev] Universal newlines support in Python 3.0

Barry Warsaw barry at python.org
Wed Aug 15 05:44:26 CEST 2007


On Aug 14, 2007, at 12:52 PM, Guido van Rossum wrote:

> On 8/14/07, Barry Warsaw <barry at python.org> wrote:
>> It would have been perfect, I think, if I could have opened the file
>> in text mode so that read() gave me strings, with universal newlines
>> and preservation of line endings (i.e. no translation to \n).
>
> You can do that already, by passing newline="\n" to the open()
> function when using text mode.

Cute, but obscure.  I'm not sure I like it as the ultimate way of  
spelling these semantics.

> Try this script for a demo:
>
> f = open("@", "wb")
> # binary mode needs a bytes literal; a str here raises TypeError
> f.write(b"bare nl\n"
>         b"crlf\r\n"
>         b"bare nl\n"
>         b"crlf\r\n")
> f.close()
>
> f = open("@", "r")  # default, universal newlines mode
> print(f.readlines())
> f.close()
>
> f = open("@", "r", newline="\n")  # recognize only \n as newline
> print(f.readlines())
> f.close()
>
> This outputs:
>
> ['bare nl\n', 'crlf\n', 'bare nl\n', 'crlf\n']
> ['bare nl\n', 'crlf\r\n', 'bare nl\n', 'crlf\r\n']
>
> Now, this doesn't support bare \r as line terminator, but I doubt you
> care much about that (unless you want to port the email package to Mac
> OS 9 :-).

Naw, I don't, though someday we'll get just such a file and a bug  
report about busted line endings ;).

There's still a problem, though: this works for .readlines() but not
for .read(), which unconditionally converts \r\n to \n.  The
FeedParser uses .read(), and I think the behavior should be the same
for both methods.
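
One way to check whether .read() and .readlines() agree on a given
interpreter is the sketch below.  It assumes a current CPython 3
release rather than the py3k branch discussed in this thread, so what
it shows may differ from the behavior described above.  The helper name
read_both_ways and the sample bytes are made up for illustration.

```python
import os
import tempfile

# Hypothetical sample data mirroring the quoted demo:
# LF and CRLF line endings, alternating.
SAMPLE = b"bare nl\ncrlf\r\nbare nl\ncrlf\r\n"


def read_both_ways(newline):
    """Return (f.read(), "".join(f.readlines())) for one newline mode."""
    fd, path = tempfile.mkstemp()
    try:
        # Write the raw bytes so no newline translation happens on output.
        with os.fdopen(fd, "wb") as f:
            f.write(SAMPLE)
        with open(path, "r", newline=newline, encoding="ascii") as f:
            whole = f.read()
        with open(path, "r", newline=newline, encoding="ascii") as f:
            joined = "".join(f.readlines())
        return whole, joined
    finally:
        os.remove(path)


# None is the default (universal newlines with translation); "" and "\n"
# are the documented modes that hand line endings back untranslated.
for mode in (None, "", "\n"):
    whole, joined = read_both_ways(mode)
    print(repr(mode), whole == joined, repr(whole))
```

If the two results differ for some mode, that mode has the
inconsistency described above; if they match, the two methods see the
same line endings.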

-Barry


