On Jul 25, 2014, at 19:13, Akira Li <4kir4.1i@gmail.com> wrote:
I've added a patch that demonstrates the "no translation" behavior for alternative newlines: http://bugs.python.org/issue1152248#msg224016
Having taken a better look at the line buffering code, I now agree with you that this is necessary; otherwise we'd have to make a much bigger change to the implementation (which I don't think we want). When I update the draft PEP I'll change that and add a rationale (this also makes the rationale for "no translation for binary files" and for "only readnl is exposed, not writenl" a lot simpler). I'll also change it in my C patch (which I hope to be able to clean up and upload this weekend).
Andrew Barnert <abarnert@yahoo.com.dmarc.invalid> writes:
On Thursday, July 24, 2014 2:08 AM, Akira Li <4kir4.1i@gmail.com> wrote:
Andrew Barnert <abarnert@yahoo.com> writes:
On Jul 23, 2014, at 5:13, Akira Li <4kir4.1i@gmail.com> wrote:
For the newline="\0" case to work, it should behave like the newline='' or newline='\n' case instead, i.e., no translation should take place, to avoid corrupting embedded "\n\r" characters.
The draft PEP discusses this. I think it would be more consistent to translate for \0, just like \r and \r\n.
I read the [draft]. No translation is a better choice here. Otherwise (at the very least) it breaks the `find -print0` use case.
No it doesn't. The only reason it breaks your code is that you add newline='\0' to your stdout wrapper as well as your stdin wrapper. If you just passed '', it would not do anything. And this is exactly parallel with the existing case with, e.g., trying to pass through a classic-Mac file full of '\r'-delimited strings that might contain embedded '\n' characters that you don't want to translate.
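The classic-Mac parallel can be checked against the io module as it exists today. This is a minimal sketch, not code from the thread or the patch: `copy_mac_lines` is a hypothetical helper, and the sample bytes are made up. It reads '\r'-delimited records and writes them back out, once with newline='' on the output wrapper and once with newline='\r'.

```python
import io


def copy_mac_lines(data: bytes, out_newline: str) -> bytes:
    """Copy CR-delimited lines through text wrappers; return the bytes written."""
    # On input, newline="\r" splits only on CR and returns lines untranslated.
    src = io.TextIOWrapper(io.BytesIO(data), encoding="ascii", newline="\r")
    sink = io.BytesIO()
    # On output, newline="" writes text as-is; newline="\r" rewrites every LF as CR.
    dst = io.TextIOWrapper(sink, encoding="ascii", newline=out_newline)
    for line in src:
        dst.write(line)
    dst.flush()
    return sink.getvalue()


# Two CR-terminated records; the first one contains an embedded LF.
records = b"first\nstill first\rsecond\r"

untranslated = copy_mac_lines(records, "")
translated = copy_mac_lines(records, "\r")

print(untranslated == records)  # embedded LF survives when output uses newline=""
print(translated)               # embedded LF is rewritten when output uses newline="\r"
```

With newline='' on the output side the data passes through unchanged; with newline='\r' the embedded '\n' is turned into a record separator, which is the corruption being discussed for the '\0' case.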
I won't repeat it several times, but as you've already found out, newline='\0' for stdout (at the very least) can be useful for line_buffering=True behavior.
...
There is also the line_buffering parameter. From the docs:
If line_buffering is True, flush() is implied when a call to write contains a newline character.
The way this is actually defined seems broken to me; IIRC (I'll check the code later) it flushes on any '\r', and on any translated '\n'. So, it's doing the wrong thing with '\r' in most modes, and with '\n' in '' mode on non-Unix systems. So my thought was, just leave it broken.
Yes. I've found at least one issue: http://bugs.python.org/issue22069
But now that I think about it, the existing code can only flush excessively, never insufficiently, and that's probably a property worth preserving. So maybe there _is_ a reason to pass newline for output without translation after all. In other words, the parameter may actually conflate _four_ things, not just three...
I'll need to think this through (and reread the code) this weekend; thanks for bringing it up.
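The line_buffering behavior discussed above can be observed by counting flush() calls on the underlying buffer. This is only a sketch under stated assumptions: `CountingBytesIO` and `flushes_for` are hypothetical names introduced here, and the results reflect current CPython behavior as I understand it, not anything guaranteed by the draft PEP.

```python
import io


class CountingBytesIO(io.BytesIO):
    """BytesIO that records how many times flush() is called on it."""

    def __init__(self):
        super().__init__()
        self.flush_calls = 0

    def flush(self):
        self.flush_calls += 1
        super().flush()


def flushes_for(text: str, newline: str) -> int:
    """Write text once to a line-buffered wrapper; return the flush count."""
    raw = CountingBytesIO()
    wrapper = io.TextIOWrapper(
        raw, encoding="ascii", newline=newline, line_buffering=True
    )
    wrapper.write(text)
    # Read the counter before the wrapper is closed, since close() also flushes.
    return raw.flush_calls


for text, newline in [("abc", ""), ("abc\n", ""), ("abc\r", "\n"), ("abc\0", "")]:
    print(repr(text), repr(newline), flushes_for(text, newline))
```

A write containing '\r' triggers a flush even when newline='\n', which is the "excessive" case, while a write ending in '\0' does not flush at all. That second case is why passing newline='\0' for output could matter for line_buffering even if no translation is performed.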
-- Akira