[Python-Dev] New lines, carriage returns, and Windows
Nick Maclaren
nmm1 at cus.cam.ac.uk
Sat Sep 29 11:25:29 CEST 2007
"Paul Moore" <p.f.moore at gmail.com> wrote:
>
> OK, so far so good - although I'm not *quite* sure there's a
> self-consistent definition of "code that only uses \n". I'll assume
> you mean code that has a concept of lines, that lines never contain
> anything other than text (specifically, neither \r nor \n can appear in
> a line, I'll punt on whether other weird stuff like form feed is
> legal), and that whenever your code needs to write data to a file, it
> writes lines with \n alone between them.
I won't. There are a few of us still left who know how this started,
and here is a simplified description.
Unix was a computer scientist's workbench, and made no attempt to be
general. In particular, its text datastream model was appropriate
for the important devices of the day - teletypes and similar. So
far, so good. But what was forgotten later is that the model does
NOT extend to other systems and, in particular, made no sense on the
record-oriented models generally used by mainframes (see Fortran for
an example).
When C was standardised, this was fudged. I tried to get it improved,
but it is one of the many things I failed to do. The handling of
ALL of the control characters in text I/O is non-portable (even \t,
despite what the standard says), and you have to follow the system's
constraints if things are to work. Unfortunately, the kludging that
the compiler does to map C to the operating system confuses things
still further - though it is essential.
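
A minimal sketch of the kind of mapping described above, written in
present-day Python rather than C. It assumes the `newline` argument of
`io.TextIOWrapper`, which is newer than this message; the helper name
`encode_lines` is hypothetical. It shows what reaches the file when the
text layer translates "\n" to a CP/M-style "\r\n", and what happens when
the program has already written a "\r" of its own.

```python
import io

def encode_lines(lines, newline):
    # Hypothetical helper: write each line followed by "\n" through a text
    # layer and return the bytes that actually reach the underlying "file".
    raw = io.BytesIO()
    # With newline="\r\n" every "\n" written is translated to "\r\n";
    # with newline="\n" nothing is translated.
    text = io.TextIOWrapper(raw, encoding="ascii", newline=newline)
    for line in lines:
        text.write(line + "\n")
    text.flush()
    return raw.getvalue()

unix_bytes = encode_lines(["a", "b"], "\n")       # b"a\nb\n"
dos_bytes = encode_lines(["a", "b"], "\r\n")      # b"a\r\nb\r\n"
# A line that already ends in "\r" picks up a second one in translation.
doubled = encode_lines(["a\r"], "\r\n")           # b"a\r\r\n"
```

The last case is the sort of surprise that appears when code assumes one
system's convention and the I/O layer applies another.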
Now, BCPL was an ancestor of C, but always was a more portable
language (i.e. it didn't start with a specific operating system in
mind), and used/uses a rather better model. In this, line separators
are atomic - e.g. '\f' is "newline-with-form-feed" and '\r' is
"newline-with-overprinting". Now, THAT model is more generic.
Not fully generic, of course, but it would cater for all of Unix,
CP/M and its derivatives (yes, Microsoft), MacOS and most mainframes
(with some reservations).
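
The atomic-separator idea can be sketched roughly as follows, in Python.
This is an illustration only, not BCPL's actual interface: the names
`NEWLINE`, `FORM_FEED`, `OVERPRINT` and `records` are hypothetical, and
treating "\r\n" as a single plain newline is an assumption made here to
cover CP/M-style files. Each record is a piece of text paired with the
kind of separator that ended it.

```python
import re

# Hypothetical names for the separator kinds.
NEWLINE = "newline"
FORM_FEED = "newline-with-form-feed"
OVERPRINT = "newline-with-overprinting"

# Assumption: "\r\n" counts as one plain newline, as on CP/M derivatives.
_SEPARATORS = {"\r\n": NEWLINE, "\n": NEWLINE, "\f": FORM_FEED, "\r": OVERPRINT}
# "\r\n" is listed first so it is matched before a lone "\r".
_SPLIT = re.compile(r"(\r\n|\n|\f|\r)")

def records(stream):
    # Split a text stream into (text, separator_kind) pairs. The separator
    # is a single atomic token; it is never part of the line's text.
    parts = _SPLIT.split(stream)  # text, sep, text, sep, ..., trailing text
    out = []
    for i in range(0, len(parts) - 1, 2):
        out.append((parts[i], _SEPARATORS[parts[i + 1]]))
    if parts[-1]:
        # Trailing text with no separator after it.
        out.append((parts[-1], None))
    return out

example = records("a\r\nb\fc")
```

Under these assumptions `example` holds "a" ended by a plain newline, "b"
ended by a newline-with-form-feed, and an unterminated "c". Code written
against such records never sees "\r" or "\n" inside a line, whatever the
underlying system uses.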
So, until and unless Python chooses to define its own I/O model,
these problems will continue to arise. Whether this one is a simple
bug or an avoidable feature, I can't say without looking harder,
but bugs are often caused by attempting to implement impossible
or confusing specifications.
Regards,
Nick Maclaren,
University of Cambridge Computing Service,
New Museums Site, Pembroke Street, Cambridge CB2 3QH, England.
Email: nmm1 at cam.ac.uk
Tel.: +44 1223 334761 Fax: +44 1223 334679