Interpreter fussy about line endings?

Tim Peters tim_one at email.msn.com
Wed May 5 21:57:13 EDT 1999


[trying to run, on a Mac, a Python source file that was created under
 Windows; Mac Python gripes about the \r\n line endings]

[Tim]
>> Text files aren't portable in or out of the Python world.

[Greg Ewing]
> I understand that about text files in general, but
> I had expected the interpreter to be more lenient
> about what it expects as source text.

Why <wink>?  I agree it would be nice if it were.

> Sorry if my comment came across as a whinge - it wasn't
> meant to be.

No problem.

> I was genuinely surprised that the parser didn't just treat
> '\r' characters as whitespace. It seemed as though a test had been
> inserted specifically to make source code from another platform
> break, and I was wondering why this had been done.

Ah, your assumptions here are simply incorrect.  Python isn't going out of
its way to gripe about \r -- \r is just one of many illegal characters.
Python would make the same complaint if it found e.g. a CTRL-C before a \n.
What Python does assume is that the platform libc normalizes text file line
endings to plain \n before Python ever sees them.

OTOH, Python *is* going out of its way *not* to fix this!  Around line 320
of Parser/tokenizer.c, there's a block of code to change \r\n to \n that's
enclosed in a

#ifndef macintosh

block(!).  So somebody sometime thought it was a bad idea to suppress the \r
in \r\n on Macs, but a good idea on other platforms.
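
For illustration only, here is a rough Python sketch of what a block like
that does to a single line of input.  The name strip_crlf is made up for
this example, and this is not the tokenizer's actual C code, just the same
idea: a trailing \r\n becomes \n, and nothing else is touched.

```python
def strip_crlf(line: str) -> str:
    # Hypothetical helper: collapse a trailing "\r\n" into a plain "\n".
    # A lone "\r" elsewhere in the line is deliberately left alone, so it
    # would still be reported as an illegal character later on.
    if line.endswith("\r\n"):
        return line[:-2] + "\n"
    return line
```

Under this sketch, strip_crlf("x = 1\r\n") gives "x = 1\n", while a line
that already ends in a bare \n passes through unchanged.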

> Hence my question:
>
> >> Is it meant to be like this?
>
> This was a serious question: was it *designed* to
> do this, and if so, why?

Right, I knew that.  I didn't write the code, though, so my "too deep for
me" was a serious answer <wink>.

> ...
> Thinking about it more, I can see that it could be
> a bit tricky dealing with all combinations of
> CR, LF and CRLF correctly without prior knowledge of
> what platform the source came from.

I wouldn't be surprised if the next iteration of Python opened source files
in binary mode and dealt with this crap all by itself.  libc isn't helping.
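
A minimal sketch of that idea, in Python rather than C: read the raw bytes
and fold CR, LF and CRLF into plain \n without asking libc for help.  The
names normalize_newlines and read_source are invented for this example, and
decoding as latin-1 is only an assumption to keep the sketch short.

```python
def normalize_newlines(data: bytes) -> bytes:
    # Replace CRLF first, so that a Windows line ending is not counted
    # as two separate line endings; then fold any remaining lone CR.
    return data.replace(b"\r\n", b"\n").replace(b"\r", b"\n")


def read_source(path: str) -> str:
    # Binary mode: the platform's text-mode translation never runs.
    with open(path, "rb") as f:
        raw = f.read()
    # Assumption for this sketch: latin-1, which maps every byte to a
    # character and so never fails to decode.
    return normalize_newlines(raw).decode("latin-1")
```

Doing the CRLF replacement before the lone-CR replacement is what lets the
same code cope with files from any of the three conventions without knowing
where they came from.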

> ...
> it would make sense to have a source code format that could be
> relied upon to work on any Python interpreter!

So long as Python uses C stdio with text-mode files, it's at the mercy of
what the platform libc does.  Try sticking chr(26) in a Windows .py file
sometime <wink>.
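
To make the chr(26) remark concrete, here is a small simulation of the two
things a Windows C runtime does when reading a file in text mode: it treats
Ctrl-Z as end-of-file and it turns \r\n into \n.  This is only a model
written for this example (simulate_windows_text_read is a made-up name); it
does not call the C library itself.

```python
CTRL_Z = b"\x1a"  # chr(26)


def simulate_windows_text_read(raw: bytes) -> bytes:
    # Model of text-mode input on Windows, not the real C runtime:
    # everything from the first Ctrl-Z onward is treated as past EOF.
    head, _sep, _ignored = raw.partition(CTRL_Z)
    # The part that is "seen" has its CRLF pairs translated to LF.
    return head.replace(b"\r\n", b"\n")
```

Feeding it b"x = 1\r\n\x1ay = 2\r\n" returns just b"x = 1\n"; the second
statement silently vanishes, which is the kind of surprise being hinted at.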

good-ideas-die-but-bad-ones-are-immortal-ly y'rs  - tim





