[Python-3000] Lines breaking
Stephen J. Turnbull
stephen at xemacs.org
Fri Jun 1 05:23:54 CEST 2007
Greg Ewing writes:
> Stephen J. Turnbull wrote:
>
> > *Python* does the right thing: it leaves the line break character(s)
> > in place. It's not Python's problem if programmers go around
> > stripping characters just because they happen to be at the end of the
> > line.
>
> But currently you *know* that, e.g. string.strip() will
> only ever remove whitespace and \n characters, so if
> those don't matter to you, it's safe to use it.
Yes. Both FF and VT *are* whitespace (AFAIK that has universal
agreement), and in particular they *are* removed by string.strip(). I
don't understand what you're worried about; nothing changes with
respect to handling of generic whitespace.
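To make that concrete, here is a small sketch, assuming a current
CPython 3 interpreter (the names `sample`, `stripped` and
`ff_vt_are_space` are only illustrative):

```python
# FF (\x0c) and VT (\x0b) count as whitespace for str.strip() and str.isspace().
sample = "\x0c\x0b  text \x0b\x0c\n"

# strip() with no argument removes leading and trailing whitespace,
# which includes FF and VT as well as spaces and newlines.
stripped = sample.strip()

# Both control characters are classified as whitespace on their own.
ff_vt_are_space = "\x0c".isspace() and "\x0b".isspace()

print(repr(stripped), ff_vt_are_space)
```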
The *only* thing that adoption of the Unicode recommendation for line
breaking changes is that "\x0c\n" is now two empty lines with well-
defined semantics instead of some number of lines with you-won't-know-
until-you-ask-the-implementation semantics.
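As a sketch of what the recommendation means in practice, assuming a
current CPython 3 interpreter (where str.splitlines() treats FF as a
line boundary, while bytes.splitlines() recognizes only \n, \r and
\r\n):

```python
# Assumes a current CPython 3 interpreter; variable names are illustrative.
text = "\x0c\n"

# str.splitlines() treats FF as a line boundary, just like LF,
# so the string yields two empty lines.
lines = text.splitlines()
lines_with_ends = text.splitlines(keepends=True)

# bytes.splitlines() only splits on \n, \r and \r\n,
# so the FF stays inside the single resulting line.
byte_lines = text.encode("ascii").splitlines()

print(lines, lines_with_ends, byte_lines)
```

The difference between the str and bytes results is the sort of thing
you would otherwise have to ask the implementation about.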
> > Those characters are mandatory breaks because the expectation is
> > *very* consistent (they say).
> I object to being told by the Unicode committee what
> semantics I should be using for ASCII characters that
> pre-date unicode by a long way.
The ASCII standard, at least as codified in ISO 646, agrees with
Unicode, by referring to ECMA-48/ISO-6429 for the definition of the 32
C0 characters. I suspect that the ANSI standard semantics of FF and
VT haven't changed since ANSI_X3.4-1963.
You just object to adopting a standard, period, because it might force
you to change your practices. That's reasonable; changing working
software is expensive. But interoperability is an important goal too.