Writing a Carriage Return in Unicode

Scott David Daniels Scott.Daniels at Acm.Org
Fri Nov 20 08:22:22 CET 2009


MRAB wrote:
> u'\u240D' isn't a carriage return (that's u'\r') but a symbol (a visible
> "CR" graphic) for carriage return. Windows programs normally expect
> lines to end with '\r\n'; just use u'\n' in programs and open the text
> files in text mode ('r' or 'w').
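To make MRAB's distinction concrete, here is a minimal sketch. It assumes Python 3 (where the u'' prefix is still accepted and open() takes a `newline` argument), which is newer than the Python this thread was written against. The helper names `write_lines` and `raw_bytes` are made up for illustration. Passing newline='\r\n' forces the Windows convention on any platform; leaving it as None gives the platform default that MRAB describes.

```python
import os
import tempfile
import unicodedata

# U+240D is a printable picture of "CR", not the control character itself.
symbol = u'\u240D'
carriage_return = u'\r'

print(unicodedata.name(symbol))   # the name of the visible symbol
print(ord(carriage_return))       # the code point of the real control character


def write_lines(path, lines, newline=None):
    """Write each line followed by u'\\n'; text mode translates the newline."""
    with open(path, 'w', encoding='utf-8', newline=newline) as f:
        for line in lines:
            f.write(line + u'\n')


def raw_bytes(path):
    """Read the file back in binary mode to see what was actually stored."""
    with open(path, 'rb') as f:
        return f.read()


# Write to a throwaway file, forcing CR LF line endings regardless of platform.
fd, demo_path = tempfile.mkstemp(suffix='.txt')
os.close(fd)
try:
    write_lines(demo_path, [u'first', u'second'], newline='\r\n')
    crlf_bytes = raw_bytes(demo_path)
finally:
    os.remove(demo_path)

print(repr(crlf_bytes))
```

The program only ever writes u'\n'; the carriage return shows up in the bytes on disk because text mode translates it on the way out.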

<rant>
This is the one thing from standards that I believe Microsoft got right
where others did not.  The ASCII (American Standard Code for Information
Interchange) standard end of line is _both_ carriage return (\r) _and_
line feed (\n) -- I believe in that order.

The Unix operating system, in its enthusiasm to make _everything_
simpler (against Einstein's advice, "Everything should be made as simple
as possible, but not simpler.") decided that end-of-line should be a
simple line feed and not carriage return line feed.  Before they made
that decision, there was debate about the order of cr-lf or lf-cr, or
inventing a new EOL character ('\037' == '\x1F' was the candidate).

If you've actually typed on a physical typewriter, you know that moving
the carriage back is a distinct operation from rolling the platen
forward; both operations are accomplished when you push the carriage
back using the bar, but you know they are distinct.  Hell, MIT even had a
"line starve" character that moved the cursor up (or rolled the platen
back).
</rant>

Lots of people talk about "dos-mode files" and "windows files" as if
Microsoft got it wrong; it did not -- Unix made up a convenient fiction
and people went along with it. (And, yes, if Unix had been there first,
their convention would, in fact, have been better.)

So, sorry for venting, but I have been wanting to say this in public
for years.

--Scott David Daniels
Scott.Daniels at Acm.Org


