As far as I know I've addressed all outstanding issues in PEP 278, http://python.sourceforge.net/peps/pep-0278.html and in the accompanying patch.
--
- Jack Jansen <Jack.Jansen@oratrix.com> http://www.cwi.nl/~jack -
- If I can't dance I don't want to be part of your revolution -- Emma Goldman -
As far as I know I've addressed all outstanding issues in PEP 278, http://python.sourceforge.net/peps/pep-0278.html and in the accompanying patch.
I'm cautiously in favor of this, but a few more things need to be addressed.
I didn't study the patch too carefully, so I'll ask: When this is disabled through the configure flag, is the 'U' mode still recognized? I think it ought to be allowed then (and mean simply text mode) so that Python code opening files in universal mode doesn't have to be prepared for that situation (it can't use the newlines attribute, but that's rarely needed I expect).
Before we go ahead, I'd like MvL and MAL to have a look at the patch to see if there would be interactions with their implementation plans for PEP 262.
I still think that this PEP is a big hack -- but as big hacks go, it seems to have a pretty good payback.
I'm hoping that eventually the parser (really the lexer) will be able to open the file in binary mode and recognize all three newline styles directly. That would solve the problems with exec, eval, and compile.
Missing:
- docs for the new open mode and file attributes (!!!)
- docs for the --with-universal-newlines flag in README
- the linecache and py_compile modules should use mode 'U' (any others?)
--Guido van Rossum (home page: http://www.python.org/~guido/)
On Monday, March 25, 2002, at 10:16, Guido van Rossum wrote:
I didn't study the patch too carefully, so I'll ask: When this is disabled through the configure flag, is the 'U' mode still recognized? I think it ought to be allowed then (and mean simply text mode) so that Python code opening files in universal mode doesn't have to be prepared for that situation (it can't use the newlines attribute, but that's rarely needed I expect).
Good point. You can now also use "U" mode in non-universal-newline-builds.
Before we go ahead, I'd like MvL and MAL to have a look at the patch to see if there would be interactions with their implementation plans for PEP 262.
I still think that this PEP is a big hack -- but as big hacks go, it seems to have a pretty good payback.
I'm hoping that eventually the parser (really the lexer) will be able to open the file in binary mode and recognize all three newline styles directly. That would solve the problems with exec, eval, and compile.
Missing:
- docs for the new open mode and file attributes (!!!)
Done.
- docs for the --with-universal-newlines flag in README
Done.
- the linecache and py_compile modules should use mode 'U'
Done.
(any others?)
Yes, lots of candidates, but I haven't fixed these. uu comes to mind, xmllib and htmllib and such, probably lots more...
--
- Jack Jansen <Jack.Jansen@oratrix.com> http://www.cwi.nl/~jack -
- If I can't dance I don't want to be part of your revolution -- Emma Goldman -
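To make the py_compile request concrete: such a module has to read source that may have been saved with any of the three line-ending conventions before handing it to compile(). The sketch below is not the patch under discussion. It assumes a present-day Python 3, where universal newline translation is what a text-mode read does when newline=None, rather than the 'U' flag proposed in this thread. The helper name load_source is hypothetical.

```python
import os
import tempfile


def load_source(path):
    """Read Python source text with CR, LF and CRLF all translated to LF.

    Hypothetical helper: newline=None asks the text layer for
    universal-newline translation on input.
    """
    with open(path, "r", encoding="utf-8", newline=None) as fp:
        return fp.read()


# Write a small module with CRLF line endings, in binary mode so that
# nothing is translated on the way out.
fd, path = tempfile.mkstemp(suffix=".py")
with os.fdopen(fd, "wb") as out:
    out.write(b"x = 1\r\ny = x + 1\r\n")

try:
    source = load_source(path)
finally:
    os.remove(path)

# The translated text compiles and runs like natively written source.
namespace = {}
exec(compile(source, path, "exec"), namespace)
```

After this runs, source contains only LF line endings and namespace holds the names defined by the temporary module.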
I'm proposing to accept PEP 278 and let Jack check it in.
There will probably be some small issues with the code (though it passes the test suite on my Linux box), but I expect the only way to tease those out is to check it in first.
Jack did an admirable job of answering all questions about the patch, adding doc patches and unit tests, and so on.
Unless there's significant uproar about this within 48 hours, I'll approve the PEP.
One comment for Jack: I think that the 'newlines' attribute should exist even if universal newlines are not configured; it should always be None in that case.
--Guido van Rossum (home page: http://www.python.org/~guido/)
"GvR" == Guido van Rossum <guido@python.org> writes:
GvR> One comment for Jack: I think that the 'newlines' attribute
GvR> should exist even if universal newlines are not configured;
GvR> it should always be None in that case.
Minor suggestions:
- when mixed newlines are found in a file, can the newlines attribute be a list of those that are found, instead of "mixed"?
- shouldn't open() mode "wU" also be illegal?
Other than that, +1 (it would squelch the complaints about the email package on systems whose MTA doesn't convert to native-newlines for mail program stdin).
-Barry
- when mixed newlines are found in a file, can the newlines attribute be a list of those that are found, instead of "mixed"?
Good idea.
- shouldn't open() mode "wU" also be illegal?
Indeed.
Other than that, +1 (it would squelch the complaints about the email package on systems whose MTA doesn't convert to native-newlines for mail program stdin).
Indeed!
--Guido van Rossum (home page: http://www.python.org/~guido/)
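For readers who want to see what a "newlines" attribute listing the conventions found looks like in practice, here is a small sketch. It assumes present-day Python 3 and its io.TextIOWrapper, which is not the 2002 patch; there the attribute is None until a line ending has been read, and becomes a collection of the endings seen once several styles have been encountered. The sample bytes are made up for illustration.

```python
import io

# One line per convention: LF, bare CR, then CRLF.
raw = io.BytesIO(b"unix\nmac\rdos\r\n")

# newline=None requests universal-newline translation on input.
fp = io.TextIOWrapper(raw, encoding="ascii", newline=None)

before = fp.newlines   # nothing has been read yet
text = fp.read()       # every line ending arrives as "\n"
seen = fp.newlines     # the conventions met while reading
```

Here before is None, text uses "\n" throughout, and seen names the three endings rather than a single "mixed" marker.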
Minor suggestions:
- when mixed newlines are found in a file, can the newlines attribute be a list of those that are found, instead of "mixed"?
- shouldn't open() mode "wU" also be illegal?
One more:
- Why isn't "rU+" allowed? It's understandable that "+" is not allowed for output, but I can't see a good reason why it shouldn't be allowed to open a file for read/write, and read with universal support (maybe some implementation detail?). Allowing this would even ease the task, since when writing to the file the programmer could consider the newlines attribute, if he wants.
--
Gustavo Niemeyer
[ 2AAC 7928 0FBF 0299 5EB5 60E2 2253 B29A 6664 3A0C ]
One more:
- Why isn't "rU+" allowed? It's understandable that "+" is not allowed for output, but I can't see a good reason why it shouldn't be allowed to open a file for read/write, and read with universal support (maybe some implementation detail?). Allowing this would even ease the task, since when writing to the file the programmer could consider the newlines attribute, if he wants.
This is answered by the PEP:
"""
A partial output implementation, where strings passed to fp.write() would be converted to use fp.newlines as their line terminator but all other output would not is far too surprising, in my view.
Because there is no output support for universal newlines there is also no support for a mode "rU+": the surprise factor of the previous paragraph would hold to an even stronger degree.
"""
--Guido van Rossum (home page: http://www.python.org/~guido/)
This is answered by the PEP:
A partial output implementation, where strings passed to fp.write() would be converted to use fp.newlines as their line terminator but all other output would not is far too surprising, in my view.
Because there is no output support for universal newlines there is also no support for a mode "rU+": the surprise factor of the previous paragraph would hold to an even stronger degree.
I've read that, but I don't agree with the second paragraph. Universal newline support is available only for input. This sentence is enough to easily predict the behavior in every usage case, including "+" modes.
If there's any intent to add support to output, then I'd understand the exclusion, since this would make backwards compatibility possible (users of "rU+" could have their code broken, since they were doing output by hand). Otherwise, it should be allowed, IMO.
--
Gustavo Niemeyer
[ 2AAC 7928 0FBF 0299 5EB5 60E2 2253 B29A 6664 3A0C ]
[PEP 278]
A partial output implementation, where strings passed to fp.write() would be converted to use fp.newlines as their line terminator but all other output would not is far too surprising, in my view.
Because there is no output support for universal newlines there is also no support for a mode "rU+": the surprise factor of the previous paragraph would hold to an even stronger degree.
[Gustavo]
I've read that, but I don't agree with the second paragraph. Universal newline support is available only for input. This sentence is enough to easily predict the behavior in every usage case, including "+" modes.
If there's any intent to add support to output, then I'd understand the exclusion, since this would make backwards compatibility possible (users of "rU+" could have their code broken, since they were doing output by hand). Otherwise, it should be allowed, IMO.
That's one possible reason not to do this. It could be added later, but I think it's better to disallow it for now. Think YAGNI.
--Guido van Rossum (home page: http://www.python.org/~guido/)
On 11 Apr 2002 at 16:18, Gustavo Niemeyer wrote:
I've read that, but I don't agree with the second paragraph. Universal newline support is available only for input. This sentence is enough to easily predict the behavior in every usage case, including "+" modes.
How can you reliably update a file if there's no obvious mapping between file position and character position?
-- Gordon
http://www.mcmillan-inc.com/
Hi Gordon!
How can you reliably update a file if there's no obvious mapping between file position and character position?
If you're updating a text file, the relation is probably about line position, not file position. AFAICS, this would only be a problem if one wanted to insert something between CR and LF <wink>.
--
Gustavo Niemeyer
[ 2AAC 7928 0FBF 0299 5EB5 60E2 2253 B29A 6664 3A0C ]
On 11 Apr 2002 at 16:56, Gustavo Niemeyer wrote:
How can you reliably update a file if there's no obvious mapping between file position and character position?
If you're updating a text file, the relation is probably about line position, not file position.
That's in the logical file. In the physical file, there may be 1 or 2 character line endings. Mixed. So knowing I want line 7 doesn't help me seek to it on disk.
No Windows programmer ever uses "+" without "b" without regret ;-(.
-- Gordon
http://www.mcmillan-inc.com/
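The mismatch between the logical and the physical file can be shown with a few lines of Python. This is only an illustration with made-up sample data, not code from the patch: it translates the line endings by hand and compares where the same word sits in the translated text and in the bytes on disk.

```python
# Bytes as they might sit on disk: two CRLF lines, then one LF line.
raw = b"one\r\ntwo\r\nthree\n"

# The logical view after newline translation (CRLF first, then bare CR).
logical = raw.replace(b"\r\n", b"\n").replace(b"\r", b"\n").decode("ascii")

# Position of the same word in each view.
logical_offset = logical.index("three")   # offset in translated text
physical_offset = raw.index(b"three")     # offset in the bytes on disk
```

Each CRLF before the target costs one extra byte on disk, so the two offsets drift apart by the number of two-character endings already passed (here 8 versus 10).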
guido wrote:
I'm proposing to accept PEP 278 and let Jack check it in.
There will probably be some small issues with the code (though it passes the test suite on my Linux box), but I expect the only way to tease those out is to check it in first.
Jack did an admirable job of answering all questions about the patch, adding doc patches and unit tests, and so on.
Unless there's significant uproar about this within 48 hours, I'll approve the PEP.
I maintain that a separate constructor (e.g. textfile) would be a much cleaner solution.
it's a completely different kind of file object, after all.
other than that, I see no problems with concept and implementation (you should have done it this way from the start ;-)
</F>
I maintain that a separate constructor (e.g. textfile) would be a much cleaner solution.
it's a completely different kind of file object, after all.
Unclear. If files opened in regular text mode and binary mode, in read mode, write mode, append mode, and update mode, are all the same text object, I think Universal newlines are just another minor variation. Also, changing the mode is a more localized change (it's all in fileobject.c).
other than that, I see no problems with concept and implementation (you should have done it this way from the start ;-)
Thanks. :)
--Guido van Rossum (home page: http://www.python.org/~guido/)
Unclear. If files opened in regular text mode and binary mode, in read mode, write mode, append mode, and update mode, are all the same text object, I think Universal newlines are just another minor variation.
there's enough "Mode U cannot be combined" stuff in the PEP to make me think that they're slightly more than just a minor variation. I think you can simplify the PEP somewhat by emphasizing the difference. making things simpler is never a bad idea.
Also, changing the mode is a more localized change (it's all in fileobject.c).
I'm willing to do the work necessary to provide an alternate factory/constructor. from what I can tell, all that has to be done is to apply the patch, and do some fairly straightforward refactoring.
(and perhaps some slight API renaming -- the functions in there are clearly too useful to deserve names like Py_UniversalNewlineFgets -- "there is no permanent place in this world for ugly names" ;-)
so +1 on checking it in as it stands, but -0 on not touching the result until 2.3 final...
</F>
there's enough "Mode U cannot be combined" stuff in the PEP to make me think that they're slightly more than just a minor variation.
I think you can simplify the PEP somewhat by emphasizing the difference. making things simpler is never a bad idea.
Also, changing the mode is a more localized change (it's all in fileobject.c).
I'm willing to do the work necessary to provide an alternate factory/constructor. from what I can tell, all that has to be done is to apply the patch, and do some fairly straightforward refactoring.
(and perhaps some slight API renaming -- the functions in there are clearly too useful to deserve names like Py_UniversalNewlineFgets -- "there is no permanent place in this world for ugly names" ;-)
so +1 on checking it in as it stands, but -0 on not touching the result until 2.3 final...
OK, good. I'll leave this between you & Jack then. I'll try to make sure you two actually do something about the open items before 2.3 is released!
--Guido van Rossum (home page: http://www.python.org/~guido/)
On Thursday, April 11, 2002, at 10:01, Fredrik Lundh wrote:
so +1 on checking it in as it stands, but -0 on not touching the result until 2.3 final...
I'm +1 on reading the second half of that sentence as a statement that you volunteer to do this:-)
(Though I still don't understand why you think a separate constructor is somehow cleaner than a new mode flag, even after reading your rationale. But I have enough other interesting things to do so I don't want to argue this point with you, you can have it if you do the work:-)
--
- Jack Jansen <Jack.Jansen@oratrix.com> http://www.cwi.nl/~jack -
- If I can't dance I don't want to be part of your revolution -- Emma Goldman -
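As a rough picture of the separate-constructor idea debated above, the sketch below wraps the newline handling in a factory that can only read, so the "mode U cannot be combined with writing" rules disappear from the caller's view. The name textfile comes from Fredrik's message, but its signature and behaviour here are assumptions made for illustration, written against present-day Python 3 and not against the 2002 patch.

```python
import io
import os
import tempfile


def textfile(path, encoding="utf-8"):
    """Hypothetical factory: a read-only text file with universal newlines.

    There is no mode argument, so there is nothing to combine wrongly.
    """
    return open(path, "r", encoding=encoding, newline=None)


# Sample file mixing bare CR, CRLF and LF line endings.
fd, path = tempfile.mkstemp()
with os.fdopen(fd, "wb") as out:
    out.write(b"a\rb\r\nc\n")

try:
    with textfile(path) as fp:
        lines = fp.readlines()
        try:
            fp.write("x")
            write_refused = False
        except io.UnsupportedOperation:
            # The object was opened for reading only.
            write_refused = True
finally:
    os.remove(path)
```

Whether this is cleaner than a mode flag is exactly the point left open between Fredrik and Jack; the sketch only shows that both spellings can sit on top of the same translation machinery.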
participants (7)
- barry@zope.com
- Fredrik Lundh
- Gordon McMillan
- Guido van Rossum
- Gustavo Niemeyer
- Jack Jansen
- Jack Jansen