"Paul Moore" <firstname.lastname@example.org> wrote:
>> So, damn the outside system, EXACTLY what does Python mean by such characters, and EXACTLY what uses of them are discouraged as having unspecified meanings? If we could get an answer to that precisely enough to write a parse tree with all terminals explicit, this problem would go away.
>
> Python, the language, means nothing by the characters. They are bytes with defined values in a byte string (that is in 2.x; in 3.0 they are Unicode characters, but there is otherwise no difference). The *language* places no interpretation on them.
Actually, it's not that simple, because of the "universal newline" rule and the fact that both Unix/C ASCII and Unicode DO provide meanings for their characters, but let that pass. Your statement is not far off the situation.
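To make the "universal newline" point concrete, here is a minimal sketch in present-day Python 3 (not the 2.x/3.0 of this thread). It assumes the standard `io.TextIOWrapper` behaviour for its `newline` argument; the helper name `read_text` is made up for the illustration.

```python
import io

RAW = b"one\rtwo\r\nthree\n"

def read_text(raw, newline):
    """Decode raw bytes through a text layer with the given newline mode."""
    # read_text is a hypothetical helper, not part of any library.
    stream = io.TextIOWrapper(io.BytesIO(raw), encoding="ascii", newline=newline)
    return stream.read()

# newline=None enables universal newlines: "\r" and "\r\n" both become "\n".
translated = read_text(RAW, None)

# newline="" still recognises all three endings but passes them through as-is.
untranslated = read_text(RAW, "")

print(repr(translated))
print(repr(untranslated))
```

So the same three control-character sequences are either rewritten or preserved depending on a parameter of the I/O layer, which is an interpretation placed on them by something other than the string literal syntax.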
> Certain library functions place an interpretation on the byte values, but you need to read the function definition for that. And (a) they may not all be consistent, and (b) they may say "follows platform behaviour", but that's the way it is, so you have to live with it.
And that is why there will continue to be confusion and inconsistency, and why there will be similar threads to this for the foreseeable future. If you regard continuing problems of this sort as acceptable, then fine, but I am pointing out that they are fairly easy to avoid. But only by specifying a precise Python model.
Incidentally, the response (b) you give is a common one, but isn't usually correct when it is given. It is, after all, the cause of the problem that started this thread.
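The sort of inconsistency under discussion can be seen with a short sketch, again assuming present-day Python 3 rather than the versions current in this thread. The sample string and variable names are illustrative only; the point is that four standard facilities disagree about which control characters end a line.

```python
import io

# A string containing LF, CR, VT, FF and NEL (U+0085) as separators.
sample = "a\nb\rc\x0bd\x0ce\x85f"

# 1. Splitting on "\n" alone recognises only LF.
by_split = sample.split("\n")

# 2. str.splitlines() also treats CR, VT, FF and NEL as line boundaries.
by_str_splitlines = sample.splitlines()

# 3. bytes.splitlines() recognises only LF, CR and CRLF.
by_bytes_splitlines = sample.encode("latin-1").splitlines()

# 4. Universal-newline text I/O likewise recognises only LF, CR and CRLF.
stream = io.TextIOWrapper(
    io.BytesIO(sample.encode("latin-1")), encoding="latin-1", newline=None
)
by_text_io = stream.read().split("\n")

for name, lines in [
    ("split", by_split),
    ("str.splitlines", by_str_splitlines),
    ("bytes.splitlines", by_bytes_splitlines),
    ("text I/O", by_text_io),
]:
    print(name, len(lines), lines)
```

Each function is documented, so this is case (a) above rather than a bug, but a program that mixes them will count lines differently for the same data.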
Regards,
Nick Maclaren,
University of Cambridge Computing Service,
New Museums Site, Pembroke Street, Cambridge CB2 3QH, England.
Email: email@example.com
Tel.: +44 1223 334761    Fax: +44 1223 334679