[Python-3000] io library/PEP 3116 bits

Guido van Rossum guido at python.org
Mon Jul 30 19:20:50 CEST 2007


On 7/30/07, skip at pobox.com <skip at pobox.com> wrote:
> I was looking at PEP 3116 to try and figure out what the newline keyword
> argument was for (it was mentioned in a couple replies to some checkin
> comments and I see it in io.py).  It's not really mentioned in the PEP as
> far as I could tell other than this:
>
>     Some new features include universal newlines and character set encoding
>     and decoding.
>
> The io.open() docstring has this to say:
>
>       newline: optional newlines specifier; must be None, '\n' or '\r\n';
>                specifies the line ending expected on input and written on
>                output.  If None, use universal newlines on input and
>                use os.linesep on output.
>
> Shouldn't '\r' be provided as an option for Macs?  Also, shouldn't the "U"
> mode flag be discarded (2to3 could maybe do this)?  Is this particular bit
> of backwards compatibility all that necessary?

I don't think \r needs to be supported: OS X uses \n, and Python 3.0
isn't going to be ported to Mac OS 9. We discussed this before, and I
promised to add \r support if anyone can find a current use case for
it. So far none has been reported.
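
As a rough illustration of the newline argument described in the
docstring quoted above, here is a minimal sketch using io.TextIOWrapper
over an in-memory buffer. It assumes the behaviour of a current io
module, which may differ in detail from the io.py under discussion; the
helper names read_text and write_text are made up for this example.

```python
import io

def read_text(raw: bytes, newline):
    """Decode raw bytes as text using the given newline setting."""
    wrapper = io.TextIOWrapper(io.BytesIO(raw), encoding="ascii", newline=newline)
    return wrapper.read()

def write_text(text: str, newline):
    """Encode text to bytes using the given newline setting."""
    buffer = io.BytesIO()
    wrapper = io.TextIOWrapper(buffer, encoding="ascii", newline=newline)
    wrapper.write(text)
    wrapper.flush()  # push the encoded bytes into the underlying buffer
    return buffer.getvalue()

# newline=None: universal newlines on input, so '\r\n' is translated to '\n'.
print(repr(read_text(b"one\r\ntwo\n", None)))
# newline='\n': only '\n' ends a line and nothing is translated on input.
print(repr(read_text(b"one\r\ntwo\n", "\n")))
# newline='\r\n': each '\n' written is translated to '\r\n' on output.
print(repr(write_text("one\ntwo\n", "\r\n")))
```

The output case with newline=None is left out on purpose, since the
result depends on os.linesep for the platform running it.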

Regarding dropping 'U': agreed. But since the fixer hasn't been
written yet, 'U' hasn't been dropped yet. We need help with little
niggling details like this!

> The other thing I wanted to comment on is the default value for n in the
> various read methods.  In some places it's -1 (why not zero? *), but in
> other places it's None, with presumably the same meaning.  Shouldn't this be
> consistent across all read methods?  The couple read methods mentioned in
> PEP 3116 only mention n=-1 as a default.
>
> Skip
>
> (*) A few days ago at work I saw someone check in a piece of code with
>
>     f.read(-1)
>
> That looked so strange to me I had to look up its meaning.  I don't think I
> had ever seen someone explicitly call read with a -1 arg.

read(0) means to read zero bytes. It always returns an empty string
(or byte array). There are plenty of end cases where this is useful.

read(), read(None) and read(-1) are all synonyms, meaning "read until
EOF". The reason there are three spellings is mostly historical: there
are so many different file-like objects, and not all of them
implemented this consistently. Since the argument is an integer, it's
easiest to use -1 as the default; but since some classes used None
as the default instead, some people started *passing* None, and then
the need was born to support both.
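
The two paragraphs above can be checked with a small sketch. It uses
io.BytesIO as a stand-in file-like object and assumes the behaviour of
a current io module; the helper name read_variants is hypothetical.

```python
import io

def read_variants(data: bytes) -> dict:
    """Call each spelling of read on a fresh stream and collect the results."""
    results = {}
    # The three "read until EOF" spellings.
    for label, args in (("read()", ()), ("read(None)", (None,)), ("read(-1)", (-1,))):
        stream = io.BytesIO(data)
        results[label] = stream.read(*args)
    # read(0) asks for zero bytes and leaves the position untouched.
    stream = io.BytesIO(data)
    results["read(0)"] = stream.read(0)
    results["tell() after read(0)"] = stream.tell()
    return results

for label, value in read_variants(b"spam").items():
    print(label, repr(value))
```

Other file-like objects may not accept all three spellings, which is
the inconsistency described above.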

Arguably this was a bad idea, and we should add a new API, readall()
(one of the implementations already has this, and read(-1) calls it).
Then the 2to3 fixer will have to recognize the old spellings. I welcome
patches!
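
To illustrate the relationship between read(-1) and readall(), here is
a minimal sketch built on io.RawIOBase as it exists in a current io
module, where read() with a negative size delegates to readall(). The
class ChunkedRaw and its readall_calls counter are invented for this
example and are not part of any library.

```python
import io

class ChunkedRaw(io.RawIOBase):
    """Hypothetical raw stream that hands out its data a few bytes at a time."""

    def __init__(self, data: bytes, chunk: int = 4):
        super().__init__()
        self._data = data
        self._pos = 0
        self._chunk = chunk
        self.readall_calls = 0  # counts how often readall() is invoked

    def readable(self):
        return True

    def readinto(self, b):
        # Copy at most one chunk into the caller's buffer; 0 signals EOF.
        n = min(len(b), self._chunk, len(self._data) - self._pos)
        b[:n] = self._data[self._pos:self._pos + n]
        self._pos += n
        return n

    def readall(self):
        self.readall_calls += 1
        return super().readall()

raw = ChunkedRaw(b"hello world")
print(raw.read(-1))        # the whole payload, assembled from small chunks
print(raw.readall_calls)   # shows that read(-1) went through readall()
```

Calling raw.readall() directly gives the same result without relying on
a magic argument value, which is the appeal of a separate method.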

But right now, getting the number of failing unit tests in the
py3k-struni branch down to zero is more important. To help, see
http://wiki.python.org/moin/Py3kStrUniTests.

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)

