[Python-Dev] Triple-quoted strings and indentation

Andrew Durdin adurdin at gmail.com
Mon Jul 11 05:27:53 CEST 2005


On 7/11/05, Josiah Carlson <jcarlson at uci.edu> wrote:
>
> You are wrong.  Current string literals are explicit.  They are what you
> type.

No, they are not:

    >>> "I typed \x41, but got this!"
    'I typed A, but got this!'

What we have are not explicit string literals but *explicit rules*,
forming part of the language definition and given in the documentation,
which state that certain sequences in string literals are preprocessed
by the compiler.  People have learnt these rules and apply them
unconsciously when reading the source--but that can apply to any rule.
For example, there's another explicit rule: an "r" prefix before the
string literal selects a different set of rules, without \-escape
sequences:

    >>> r"I typed \x41, but got this!"
    'I typed \\x41, but got this!'
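
To make the difference concrete, here is a minimal sketch comparing the
two literals above. The variable names `normal` and `raw` are only
illustrative; the point is that the four typed characters \x41 collapse
to a single character in the first case and survive in the second:

```python
# The same characters typed in the source, under two different rule sets.
normal = "I typed \x41, but got this!"   # \x41 is preprocessed into 'A'
raw = r"I typed \x41, but got this!"     # the backslash sequence is kept

# The raw literal is three characters longer: '\', 'x', '4', '1' vs 'A'.
print(len(raw) - len(normal))   # 3
print(normal == raw)            # False
```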

The point is that processing \-escape sequences is just as implicit as
my proposal: but tradition, and custom, and documentation make it
*seem* explicit of itself. IOW, arguing that my proposal is "implicit"
or "DWIM" is neither relevant nor valid--but arguing either that it is
confusing to a long-term Python user, or that it is not fully backward
compatible is valid, and these (and other) arguments against should be
weighed up against those in favour. (Even some fairly recent and major
changes to Python have been accepted despite having these two
particular arguments against them, such as unified classes/types and
nested scopes).

> When you have to start differentiating, or consider differentiating, how
> preprocessing occurs based on the existence or non-existence of escaped
> newlines, you should realize that this has a serious "Do what I mean"
> stink (as Guido has already stated, more politely).

What I am considering differentiating on here is a feature of Python
that is (at least) awkward and (at most) has a "serious stink" -- the
ability to escape newlines in a single-quoted [' or "] string with a
backslash, which has inconsistent or confusing behaviour:

    >>> "This is a normal string\
    ... with an escaped newline."
    'This is a normal stringwith an escaped newline.'

    >>> r"This is a raw string\
    ... with an escaped newline."
    'This is a raw string\\\nwith an escaped newline.'

This is not an issue with triple-quoted strings (TQSs) because they can
naturally (i.e. without escapes) span multiple lines.
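
As a rough sketch of that contrast (the variable names are hypothetical,
chosen only for illustration): a backslash-newline behaves differently
in normal and raw single-quoted strings, while a triple-quoted string
needs no backslash at all, raw or not:

```python
# Single-quoted strings: the newline must be escaped with a backslash.
normal = "This is a normal string\
with an escaped newline."
raw = r"This is a raw string\
with an escaped newline."

print("\n" in normal)    # False: backslash and newline are both dropped
print("\\\n" in raw)     # True: backslash and newline are both kept

# Triple-quoted strings: the newline is simply part of the literal.
tqs = """This is a triple-quoted string
with a literal newline."""
raw_tqs = r"""This is a raw triple-quoted string
with a literal newline."""

print("\n" in tqs, "\n" in raw_tqs)      # True True
print("\\" in tqs, "\\" in raw_tqs)      # False False
```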


Since your main objections above are much the same as Guido's, I'll
respond to his in this message also:

On 7/11/05, Guido van Rossum <gvanrossum at gmail.com> wrote:
>
> The scheme may be explicitly spelled out in the language reference,
> but it is implicit to the human reader -- what you see is no longer
> what you get.

See discussion of explicitness vs. implicitness above.

> I recommend that you give it up. You ain't gonna convince me.

Very likely. But given the number of times that similar proposals have
been put forth in the past, it is reasonable to expect that they will
be brought up again in the future by others, if this is rejected--and
in that case, these others can simply be pointed to a thorough (but
rejected) PEP that discusses the proposal and variants and reasons for
rejection.

And so--while I still hope that you can be convinced (there's
precedent ;-), I think a good, thorough PEP will be of benefit even if
rejected. And, of course, such a PEP is bound to be more convincing
than a hasty ill-considered one. So I am rewriting my previous draft
accordingly, and will submit it as a PEP when it's done.

Cheers,

Andrew.

