[Python-3000] Lines breaking

Stephen J. Turnbull stephen at xemacs.org
Fri Jun 1 05:23:54 CEST 2007


Greg Ewing writes:

 > Stephen J. Turnbull wrote:
 > 
 > > *Python* does the right thing: it leaves the line break character(s)
 > > in place.  It's not Python's problem if programmers go around
 > > stripping characters just because they happen to be at the end of the
 > > line.
 > 
 > But currently you *know* that, e.g. string.strip() will
 > only ever remove whitespace and \n characters, so if
 > those don't matter to you, it's safe to use it.

Yes.  Both FF and VT *are* whitespace (AFAIK that has universal
agreement), and in particular they *are* removed by string.strip().  I
don't understand what you're worried about; nothing changes with
respect to handling of generic whitespace.
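
To make the strip() point concrete, here is a minimal sketch.  It
assumes a present-day Python 3 interpreter, where the method lives on
str; the variable names are mine and purely illustrative.

```python
# FF (form feed, U+000C) and VT (vertical tab, U+000B) count as whitespace.
FF = "\x0c"
VT = "\x0b"

ff_is_space = FF.isspace()
vt_is_space = VT.isspace()

# strip() with no argument removes leading and trailing whitespace,
# which includes FF and VT as well as "\n".
sample = FF + "  payload " + VT + "\n"
stripped = sample.strip()

print(ff_is_space, vt_is_space, repr(stripped))
```

So code that already strips "whitespace and \n" is already discarding
FF and VT as well.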

The *only* thing that adoption of the Unicode recommendation for line
breaking changes is that "\x0c\n" is now two empty lines with well-
defined semantics instead of some number of lines with you-won't-know-
until-you-ask-the-implementation semantics.
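
As an illustration of those semantics, a small sketch.  It assumes a
Python 3 str.splitlines() that treats FF as a line boundary, as current
CPython documents; the bytes comparison is my addition, included only
to show the narrower ASCII-only convention.

```python
text = "\x0c\n"

# str.splitlines() breaks at FF and again at LF: two empty lines.
lines = text.splitlines()

# keepends=True leaves the line break character(s) in place.
lines_with_ends = text.splitlines(keepends=True)

# bytes.splitlines() only recognises \n, \r and \r\n, so FF stays
# inside the single resulting line.
byte_lines = text.encode("ascii").splitlines()

print(lines, lines_with_ends, byte_lines)
```

The str result is the "two empty lines" case; the bytes result shows
what an implementation that does not treat FF as a break would give.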

 > > Those characters are mandatory breaks because the expectation is
 > > *very* consistent (they say).

 > I object to being told by the Unicode committee what
 > semantics I should be using for ASCII characters that
 > pre-date unicode by a long way.

The ASCII standard, at least as codified in ISO 646, agrees with
Unicode, by referring to ECMA-48/ISO-6429 for the definition of the 32
C0 characters.  I suspect that the ANSI standard semantics of FF and
VT haven't changed since ANSI_X3.4-1963.

You just object to adopting a standard, period, because it might force
you to change your practices.  That's reasonable; changing working
software is expensive.  But interoperability is an important goal too.

