Why 'r' mode anyway?

Tim Peters tim.peters at gmail.com
Sat Jan 15 01:13:13 CET 2005


[Tim Peters]
>> That differences may exist is reflected in the C
>> standard, and the rules for text-mode files are more restrictive
>> than most people would believe.

[Irmen de Jong]
> Apparently. Because I know only about the Unix <-> Windows
> difference (Windows converts \r\n <--> \n when using 'r' mode,
> right).  So it's in the line endings.

That's one difference.  The worse difference is that, in text mode on
Windows, the first instance of chr(26) in a file is taken as meaning
"that's the end of the file", no matter how many bytes may follow it. 
That's fine by the C standard, because everything about a text-mode
file containing a chr(26) character is undefined.
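
To make the two effects above concrete, here is a rough sketch that simulates what a text-mode read on Windows hands back, compared with the raw bytes.  It is a pure-Python emulation, not the C runtime itself, so it behaves the same on any platform; the function name simulate_windows_text_read is made up for illustration.

```python
def simulate_windows_text_read(raw: bytes) -> bytes:
    """Roughly emulate a Windows C-runtime text-mode read (illustration only)."""
    # The first chr(26) (Ctrl-Z) is treated as end-of-file:
    # everything from that byte onward is never seen by the reader.
    end = raw.find(b"\x1a")
    if end != -1:
        raw = raw[:end]
    # Each CR LF pair is collapsed into a single LF.
    return raw.replace(b"\r\n", b"\n")


# Hypothetical file contents: two lines, a stray Ctrl-Z, then more data.
data = b"line one\r\nline two\r\n\x1ahidden tail\r\n"

text_view = simulate_windows_text_read(data)  # what 'r' mode would show
binary_view = data                            # what 'rb' mode would show

print(repr(text_view))
print(len(text_view), "bytes seen in text mode,",
      len(binary_view), "bytes seen in binary mode")
```

The bytes after the chr(26) are still in the file; a text-mode reader just never sees them, which is why binary data read in 'r' mode can appear silently truncated.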

> Is there more obscure stuff going on on the other systems you
> mentioned (Mac OS, VAX) ?

I think on Mac Classic it was *just* line end differences.  Native VAX
has many file formats.  "Record-based" file formats used to be very
popular.  There the OS saves meta-information in the file:  each
record may contain an offset to the start of the next record, and the
file may even contain an index structure to support random access to
records quickly (for example, "a line" may be a record, and "read the
last line" may go quickly).  Read that in binary mode, and you'll be
reading up the bits in the index and offsets too, etc.  IIRC, Unix was
actually quite novel at the time in insisting that all files were just
raw byte streams to the OS.
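
A toy illustration of the record idea, assuming a made-up layout in which every record is preceded by a 2-byte little-endian length.  This is not any real VAX format, and pack_records / read_records are hypothetical helpers; the point is only that a raw byte-level read sees the bookkeeping along with the data.

```python
import struct


def pack_records(lines):
    """Store each line as: 2-byte little-endian length, then the bytes."""
    out = bytearray()
    for line in lines:
        out += struct.pack("<H", len(line)) + line
    return bytes(out)


def read_records(blob):
    """Record-aware read: follow the length prefixes, return only the data."""
    records = []
    pos = 0
    while pos < len(blob):
        (length,) = struct.unpack_from("<H", blob, pos)
        pos += 2
        records.append(blob[pos:pos + length])
        pos += length
    return records


blob = pack_records([b"first", b"second"])

print(read_records(blob))  # the "lines", as a record-aware reader sees them
print(repr(blob))          # the raw bytes, length prefixes included
```

A reader that understands the layout gets back just the lines; a reader that treats the file as a plain byte stream gets the length fields mixed in with the text.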

> (That means that the bug in SimpleHTTPServer that my patch
> 839496 addressed, also occurred on those systems? Or that
> the patch may be incorrect after all??)

Don't know, and (sorry) no time to dig.

> While your argument about why Python doesn't use its own
> platform-independent file format is sound of course, I find it often
> a nuisance that platform specific things trickle through into Python
> itself and ultimately in the programs you write. I sometimes feel
> that some parts of Python expose the underlying C/os
> implementation a bit too much. Python never claimed write once
> run anywhere (as that other language does) but it would have
> been nice nevertheless ;-)
> In practice it's just not possible I guess.

It would be difficult at best.  Python hides a lot of platform crap,
but generally where it's reasonably easy to hide.  It's not easy to
hide native file conventions, partly because Python wouldn't play well
with *other* platform software if it did.

Remember that Guido worked on ABC before Python, and Python is in
(small) part a reaction against the extremes of ABC.  ABC was 100%
platform-independent.  You could read and write files from ABC.
However, the only files you could read from ABC were files that were
written by ABC -- and files written by ABC were essentially unusable
by other software.  Socket semantics were also 100% portable in ABC: 
it didn't have sockets, nor any way to extend the language to add
them.  Etc -- ABC was a self-contained universe.  "Plays well with
others" was a strong motivator for Python's design, and that often
means playing by others' rules.


