[Python-ideas] Iterating non-newline-separated files should be easier
Andrew Barnert
abarnert at yahoo.com
Sat Jul 26 06:03:30 CEST 2014
On Jul 25, 2014, at 19:24, Akira Li <4kir4.1i at gmail.com> wrote:
> Nick Coghlan <ncoghlan at gmail.com> writes:
>
>> On 26 Jul 2014 04:33, "Andrew Barnert" <abarnert at yahoo.com.dmarc.invalid> wrote:
>>> As I've said before, I don't really like the design for '\r' and '\r\n',
>>> or the fact that three separate notions (universal-newlines flag, line
>>> ending for readline, and output translation for write) are all conflated
>>> into one idea and crammed into one parameter, but I think it's probably too
>>> late and too radical to change that.
>>
>> It's potentially still worth spelling out that idea as a Rejected
>> Alternative in the PEP. A draft design that separates them may help clarify
>> the concepts being conflated more effectively than simply describing them,
>> even if your own pragmatic assessment is "too much pain for not enough
>> gain".
>
> It can't be in the rejected ideas because it is the current behavior for
> io.TextIOWrapper(newline=..) and it will never change (in Python 3) due
> to backward compatibility.
That's exactly why changing it would be a "rejected idea". It certainly doesn't hurt to document the fact that we thought about it and decided not to change it for backward compatibility reasons.
> As I understand Andrew doesn't like that *newline* parameter does too
> much:
>
> - *newline* parameter turns on/off universal newline mode
> - it may specify the line separator e.g., newline='\r'
> - it specifies whether newline translation happens e.g., newline=''
> turns it off
> - together with *line_buffering*, it may enable flush() if newline is
> written
Exactly. And the fourth one applies only indirectly: the "newline" that triggers a line_buffering flush doesn't exactly mean _either_ "\n" or the newline argument. And the related-but-definitely-not-the-same newlines attribute makes it even more confusing. (I've found bug reports with both Guido and Nick confused into thinking that newline was available as an attribute after construction; what hope do the rest of us have?)
But the reality is, it rarely affects real-life programs, so it's definitely not worth breaking compatibility over. And it's still a whole lot cleaner than the 2.x design despite having a lot more details to deal with.
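To make the conflation concrete, here is a minimal sketch of how the single newline argument changes both reading and writing on io.TextIOWrapper. The helper names (read_lines, written_bytes) and the sample data RAW are made up for illustration; only io.TextIOWrapper, io.BytesIO, and the newlines attribute come from the standard library.

```python
import io

# Hypothetical sample data mixing all three line-ending styles.
RAW = b"a\nb\r\nc\rd"


def read_lines(newline):
    """Read RAW through a TextIOWrapper built with the given newline."""
    f = io.TextIOWrapper(io.BytesIO(RAW), encoding="ascii", newline=newline)
    return f.readlines()


def written_bytes(newline):
    """Write 'x\\ny\\n' through a TextIOWrapper and return the raw bytes."""
    buf = io.BytesIO()
    f = io.TextIOWrapper(buf, encoding="ascii", newline=newline)
    f.write("x\ny\n")
    f.flush()
    return buf.getvalue()


# newline=None: universal newlines on, every ending translated to '\n'.
print(read_lines(None))    # ['a\n', 'b\n', 'c\n', 'd']

# newline='': universal newlines still on, but no translation.
print(read_lines(""))      # ['a\n', 'b\r\n', 'c\r', 'd']

# newline='\r': only '\r' ends a line, nothing is translated.
print(read_lines("\r"))    # ['a\nb\r', '\nc\r', 'd']

# On output the same argument picks the translation target instead.
print(written_bytes(""))      # b'x\ny\n'
print(written_bytes("\r"))    # b'x\ry\r'
print(written_bytes("\r\n"))  # b'x\r\ny\r\n'

# The read-only newlines attribute reports endings seen so far in
# universal mode; it is not the value passed as newline.
f = io.TextIOWrapper(io.BytesIO(RAW), encoding="ascii", newline=None)
f.read()
print(f.newlines)
```

Note that written_bytes(None) is left out on purpose: in that mode '\n' is translated to os.linesep, so the result depends on the platform.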